Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve][broker] LS-1263:Sync commits from apache/pulsar branch-3.0 into 3.1_ds #256

Merged
merged 98 commits into from
Apr 23, 2024

Conversation

mukesh-ctds
Copy link
Collaborator

Motivation

Explain here the context, and why you're making that change. What is the problem you're trying to solve.
This PR sync all commits from apache/branch-3.0 into 3.1_ds which are not present.

Modifications

Describe the modifications you've done.

  • Cherry-picked commits from branch-3.0 which are not present on 3.1_ds

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (yes)
  • The public API: (no)
  • The schema: (no)
  • The default values of configurations: (no)
  • The wire protocol: (no)
  • The rest endpoints: (no)
  • The admin cli options: (no)
  • Anything that affects deployment: (no)

Documentation

Check the box below or label this PR directly (if you have committer privilege).

Need to update docs?

  • doc-required

    (If you need help on updating docs, create a doc issue)

  • no-need-doc

    (Please explain why)

  • doc

    (If this PR contains doc changes)

poorbarcode and others added 30 commits April 19, 2024 18:41
### Motivation

- The task `trim ledgers` runs in the thread `BkMainThreadPool.choose(ledgerName)`
- The task `write entries to BK` runs in the thread `BkMainThreadPool.choose(ledgerId)`

So the two tasks above may run concurrently/

The task `trim ledgers` work as the flow below:
- find the ledgers which are no longer to read, the result is `{Ledgers before the slowest read}`.
- check if the `{Ledgers before the slowest read}` is out of retention policy, the result is `{Ledgers to be deleted}`.
  - if the create time of the ledger is lower than the earliest retention time, mark it should be deleted
  - if after deleting this ledger, the rest ledgers are still larger than the retention size, mark it should be deleted
- delete the`{Ledgers to be deleted}`

**(Highlight)** There is a scenario that causes the task `trim ledgers` did  discontinuous ledger deletion, resulting consume messages discontinuous:
- context:
  - ledgers: `[{id=1, size=100}, {id=2,size=100}]`
  - retention size: 150
  - no cursor there
- Check `ledger 1`, skip by retention check `(200 - 100) < 150`
- One in-flight writing is finished, the `calculateTotalSizeWrited()` would return `300` now.
- Check `ledger 2`, retention check `(300 - 100) > 150`, mark the ledger-2 should be deleted.
- Delete the `ledger 2`.
- Create a new consumer. It will receive messages from `[ledger-1, ledegr-3]`, but the `ledger-2` will be skipped.

### Modifications

Once the retention constraint has been met, break the loop.

(cherry picked from commit 782e91f)
(cherry picked from commit b87c0fb)
…pache#21035)

Motivation: After [PIP-118: reconnect broker when ZooKeeper session expires](apache#13341), the Broker will not shut down after losing the connection of the local metadata store in the default configuration. However, before the ZK client is reconnected, the events of BK online and offline are lost, resulting in incorrect BK info in the memory. You can reproduce the issue by the test `BkEnsemblesChaosTest. testBookieInfoIsCorrectEvenIfLostNotificationDueToZKClientReconnect`(90% probability of reproduce of the issue, run it again if the issue does not occur)

Modifications: Refresh BK info in memory after the ZK client is reconnected.
(cherry picked from commit db20035)
(cherry picked from commit 5f99925)
…ed on topic, when dedup is enabled and no producer is there (apache#20951)

(cherry picked from commit 30073db)
(cherry picked from commit f68589e)
…ata when doing lookup (apache#21063)

Motivation: If we set `allowAutoTopicCreationType` to `PARTITIONED`, the flow of the create topic progress is like the below:
1. `Client-side`: Lookup topic to get partitioned topic metadata to create a producer.
1. `Broker-side`: Create partitioned topic metadata.
1. `Broker-side`: response `{"partitions":3}`.
1. `Client-side`: Create separate connections for each partition of the topic.
1. `Broker-side`: Receive 3 connect requests and create 3 partition-topics.

In the `step 2` above, the flow of the progress is like the below:
1. Check the policy of topic auto-creation( the policy is `{allowAutoTopicCreationType=PARTITIONED, defaultNumPartitions=3}` )
1. Check the partitioned topic metadata already exists.
1. Try to create the partitioned topic metadata if it does not exist.
1. If created failed by the partitioned topic metadata already exists( maybe another broker is also creating now), read partitioned topic metadata from the metadata store and respond to the client.

There is a race condition that makes the client get non-partitioned metadata of the topic:
| time | `broker-1` | `broker-2` |
| --- | --- | --- |
| 1 | get policy: `PARTITIONED, 3` | get policy: `PARTITIONED, 3` |
| 2 | check the partitioned topic metadata already exists | Check the partitioned topic metadata already exists |
| 3 | Partitioned topic metadata does not exist, the metadata cache will cache an empty optional for the path | Partitioned topic metadata does not exist, the metadata cache will cache an empty optional for the path |
| 4 |  | succeed create the partitioned topic metadata |
| 5 | Receive a ZK node changed event to invalidate the cache of the partitioned topic metadata |
| 6 | Creating the metadata failed due to it already exists |
| 7 | Read the partitioned topic metadata again |

If `step-5` is executed later than `step-7`, `broker-1` will get an empty optional from the cache of the partitioned topic metadata and respond non-partitioned metadata to the client.

**What thing would make the `step-5` is executed later than `step-7`?**
Provide a scenario: Such as the issue that the PR apache#20303 fixed, it makes `zk operation` and `zk node changed notifications`  executed in different threads: `main-thread of ZK client` and `metadata store thread`.

Therefore, the mechanism of the lookup partitioned topic metadata is fragile and we need to optimize it.

Modifications: Before reading the partitioned topic metadata again, refresh the cache first.
(cherry picked from commit d099ac4)
(cherry picked from commit 2e534d2)
… many requests (apache#21216)

Motivation: The Pulsar client will close the socket if it receives a `ServiceNotReady` error when doing a lookup. The Broker will respond to the client with a `TooManyRequests` error if there are too many lookup requests in progress, but the Pulsar Proxy responds to the client with a `ServiceNotReady` error in the same scenario.

Modifications: Make Pulsar Proxy respond to the client with a `TooManyRequests` error if there are too many lookup requests in progress.
(cherry picked from commit d6c3fa4)
(cherry picked from commit 9a7c4bb)
…er id when switch ledger (apache#21201)

### Modifications
- Print a warning log if the SSL handshake error
- Print ledger ID when switching ledger

(cherry picked from commit 8485d68)
(cherry picked from commit f243925)
…ing queue is not empty (apache#21259)

Reproduce steps:
- Create a reader.
- Reader pulls messages into `incoming queue`, do not call `reader.readNext` now.
- Trim ledger task will delete the ledgers, then there is no in the topic.
- Now, you can get messages if you call `reader.readNext`, but the method `reader.hasMessageAvailable` return `false`

Note: the similar issue of `MultiTopicsConsumerImpl` has been fixed by apache#13332, current PR only trying to fix the issue of `ConsumerImpl`.

Make `reader.hasMessageAvailable` return `true` when `incoming queue` is not empty.

(cherry picked from commit 6d82b09)
(cherry picked from commit 38c3f0c)
### Motivation

After trimming ledgers, the variable `lastConfirmedEntry` of the managed ledger might rely on a deleted ledger(the latest ledger which contains data).

There is a bug that makes pulsar allow users to set the start read position to an unexisting ledger or a deleted ledger when creating a subscription. This makes the `backlog` and `markDeletedPosition` wrong.

### Modifications

Fix the bug.

(cherry picked from commit 4ee5cd7)
(cherry picked from commit dd28bb4)
…gers (apache#21250)

### Background
- But after trimming ledgers, `ml.lastConfirmedPosition` relies on a deleted ledger when the current ledger of ML is empty.
- Cursor prevents setting `markDeletedPosition` to a value larger than `ml.lastConfirmedPosition`, but there are no entries to read<sup>[1]</sup>.
- The code description in the method `advanceCursors` said: do not make `cursor.markDeletedPosition` larger than `ml.lastConfirmedPosition`<sup>[2]</sup>

### Issue
If there is no durable cursor, the `markDeletedPosition` might be set to `{current_ledger, -1}`, and `async mark delete` will be prevented by the `rule-2` above. So he `backlog`, `readPosition`, and `markDeletedPosition` of the cursor will be in an incorrect position after trimming the ledger. You can reproduce it by the test `testTrimLedgerIfNoDurableCursor`

### Modifications
Do not make `cursor.markDeletedPosition` larger than `ml.lastConfirmedPosition` when advancing non-durable cursors.

(cherry picked from commit ca77982)
(cherry picked from commit 6895919)
…ne faster (apache#21183)

There is an issue similar to the apache#21155 fixed one.

The client assumed the connection was inactive, but the Broker assumed the connection was fine. The Client tried to  use a new connection to reconnect an exclusive consumer, then got an error `Exclusive consumer is already connected`

- Check the connection of the old consumer is available when the new one tries to subscribe

(cherry picked from commit 29db8f8)
(cherry picked from commit b796f56)
…ptions caused by wrong topicName (apache#21997)

Similar to: apache#20131

The master branch has fixed the issue by apache#19841 Since it will makes users can not receive the messages which created in mistake, we did not cherry-pick apache#19841 into other branches, see detail apache#19841)

It works like this:
1. createSubscription( `tp1` )
2. is partitioned topic?
  `no`: return subscriptions
  `yes`: createSubscription(`tp1-partition-0`)....createSubscription(`tp1-partition-n`)

---

```java
String partitionedTopic = "tp1-partition-0-DLQ";

TopicName partition0 = partitionedTopic.getPartition(0);// Highlight: the partition0.toString() will be "tp1-partition-0-DLQ"(it is wrong).The correct value is "tp1-partition-0-DLQ-partition-0"
```

Therefore, if there has a partitioned topic named `tp1-partition-0-DLQ`, the method `PersistentTopics.createSubscription` will works like this:
1. call Admin API ``PersistentTopics.createSubscription("tp1-partition-0-DLQ")`
2. is partitioned topic?
3. yes, call `TopicName.getPartition(0)` to get partition 0 and will get `tp1-partition-0-DLQ` , then loop to step-1.

Then the infinite HTTP call `PersistentTopics.createSubscription` makes the broker crash.

If hits the issue which makes the topic name wrong, do not loop to step 1.

The PR apache#19841 fixes the issue which makes the topic name wrong, and this PR will create unfriendly compatibility, and PIP 263 apache#20033 will make compatibility good.

(cherry picked from commit 4386401)
… reads (apache#22295)

(cherry picked from commit 2803ba2)

(cherry picked from commit fde7c49)
…o same topics (apache#22255)

(cherry picked from commit c616b35)
(cherry picked from commit 5ab0e93)
…cher.consumerList and dispatcher.consumerSet (apache#22270)

(cherry picked from commit cba1600)
(cherry picked from commit 94edfe4)
…tcher.consumerSet (apache#22283)

(cherry picked from commit a52945b)
(cherry picked from commit bec3be2)
…ar-function-go (apache#21444)

(cherry picked from commit 2d57624)
(cherry picked from commit 3ed5ea2)
poorbarcode and others added 22 commits April 19, 2024 20:07
…ed mode and allowOutOfOrderDelivery=true (apache#22533)

(cherry picked from commit 2badcf6)
(cherry picked from commit 53d7848)
…tiered storage (apache#22531)

(cherry picked from commit fbf4cb7)
(cherry picked from commit ff8d3b7)
…onsPerPartitionedTopic<0 (apache#22397)

(cherry picked from commit fb5caeb)
(cherry picked from commit 386f6f0)
Co-authored-by: ceceezhang <[email protected]>
(cherry picked from commit cea1a9b)
(cherry picked from commit b41e752)
)

(cherry picked from commit ffdfc0c)
(cherry picked from commit 42ae91a)
…fective (apache#22490)

(cherry picked from commit 4ca4e28)
(cherry picked from commit 94f1254)
…igurationMetadataStore equals localMetadataStore (apache#22519)

(cherry picked from commit 1dd82a0)
(cherry picked from commit 8f5b825)
(cherry picked from commit 7aedb6b)
(cherry picked from commit beb147c)
…th old pulsar version (apache#22535)

(cherry picked from commit 59daac6)
(cherry picked from commit 8439082)
…esolving original client IP of http/https requests (apache#22524)

(cherry picked from commit 4a88721)
(cherry picked from commit 7d52dd7)
…ing a terminated managed ledger (apache#22552)

(cherry picked from commit 35599b7)
(cherry picked from commit def695b)
Co-authored-by: 道君 <[email protected]>
Co-authored-by: Lari Hotari <[email protected]>
(cherry picked from commit 73dc213)
(cherry picked from commit fc136e8)
… GCP/GCS offloading (apache#22554)

(cherry picked from commit e81f370)
(cherry picked from commit 5c09c20)
…umberOfEntriesInStorage"

This reverts commit e3531e8.

(cherry picked from commit d6791a8)
@mukesh-ctds mukesh-ctds changed the title [improve][broker] LS-1263: Do-not-merge [improve][broker] LS-1263:Sync commits from apache/pulsar branch-3.0 into 3.1_ds Apr 23, 2024
@mukesh-ctds mukesh-ctds marked this pull request as ready for review April 23, 2024 10:04
@mukesh-ctds mukesh-ctds self-assigned this Apr 23, 2024
@srinath-ctds srinath-ctds merged commit 6b7aa50 into 3.1_ds Apr 23, 2024
48 of 50 checks passed
@srinath-ctds srinath-ctds deleted the LS-1263-new branch April 23, 2024 10:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.